Test Use and Test Reliability in a Curriculum
For Educable Mentally Retarded Children¹
I. Leon Smith² and Sandra Greenberg
Curriculum Research and Development
Center in Mental Retardation
Yeshiva University
Working Paper No. 1

Evaluation research is one of the deities invoked by educators to
determine the utility of innovative approaches through the collection of hard
data about their performance. While the process is universally praised by
curriculum developers and researchers alike, the good works done in its name
are remarkably few. Evaluation, in fact, is widely regarded as the least
satisfactory component of program development. Guba (1969), for example,
refers to the decades of non-significant differences that have been produced
by the application of evaluation procedures to comparative studies of
alternatives in all fields of education. Unfortunately, it is impossible to tell
whether the absence of significant differences is a result of the failure of
the evaluation procedures and the measures employed to detect differences,
the inability of the programs to produce the desired effects, or both. Much
of the difficulty could be eliminated by initially examining the ability of
the instruments to detect specific kinds of differences before they are put
to use for evaluation and research purposes. Tests that do not detect certain
differences should not be used in curriculum applications where those
differences are being investigated. This issue is particularly crucial in
curriculum evaluation and research where new instruments must be developed
because available, standardized measures are not substantively and
methodologically appropriate.

The purpose of this paper is to discuss selected applications of new tests
developed within the context of a large-scale curriculum for educable mentally
retarded (EMR) children, namely, the Social Learning Curriculum (Goldstein,
1969; Heiss & Mischio, 1971), and to investigate three types of reliability
that need to be demonstrated in order to provide a basis for these
applications. The three reliability coefficients refer to differences among
students, classrooms, and tests.

Applications Based on Student Differences 

One anticipated use of the tests is the more homogeneous regrouping of
EMR children within existing special classes for the purpose of providing more
adequate instruction based on the Social Learning Curriculum (SLC). This
approach differs considerably from traditional ability grouping that forms the
basis for much of the recent discussion concerning the adequacy of the special
class concept (MacMillan, 1971; Dunn, 1968). Under the SLC-test grouping
approach, EMR children would be placed together based on behaviors that are
specifically related to the content of instruction rather than on the basis
of IQ alone.³

Although a number of potentially useful grouping algorithms are available
for this purpose (Johnson, 1967; Cole, 1969; McQuitty, 1960, 1970), the
statistical properties of the procedures are such that groupings of students
are generated regardless of the quality (reliability) of the measuring
instruments (Baker, 1972). Thus, before grouping procedures are applied, the
reliability of the SLC measures with respect to differentiating students must
be demonstrated. The use of high quality data in itself, however, provides no
guarantee that the results will be more than nonsense or that the generated
groupings can be translated into different instructional methodologies and
teacher behaviors. Recent evidence on this point from outside the field of
special education suggests that the probability of obtaining meaningful groups
is increased when task-specific achievements or measures of behavior directly
related to the outcomes of instruction are employed as opposed to more general
measures (Gagné & Gropper, 1965). Within the field of special education,
Clausen (1972) reported that an attempt to define sub-groups of mental
deficiency on the basis of constellations of basic abilities in sensory,
motor, perceptual, and cognitive functions was not particularly successful.
These findings, then, are consistent with the use of SLC-based tests for the
purpose of regrouping EMR children.

Other attempts to form groups that extend beyond the use of IQ are
discussed by MacMillan & Jones (1972), Jordan (1971), and Leland (1972). In
these approaches, however, the variables or behaviors employed to group
students are not related to or derived from a specific curriculum. Furthermore,
given the importance of pupil grouping to present practices within the field
of special education, it is surprising that these approaches tend to rely on
judgmental and impressionistic combinations of test scores and pupil
characteristics when well developed grouping algorithms of the type referred
to in this paper exist.

Another test application for which reliable student differences need to be
demonstrated concerns the exploration of relationships with other individual
difference variables. The latter are likely to include variables considered
to be more direct measures of the characteristics in question such as observed
behaviors in classroom settings (criterion-related validity) in addition to
other constructs such as IQ and measures of perceptual motor ability (construct
validity). In this connection, an important validational consideration involves
the degree to which the methodology employed in the development of the SLC
tests accomplishes its purpose, namely, to measure knowledge of certain social
concepts while minimizing hypothesized deficits and difficulties associated
with the assessment of retardates' performance. A more complete discussion
of this issue in relation to the SLC tests is presented in the method section
of the paper. This line of inquiry should lead to the notion of
aptitude-assessment interactions (AAIs) as the testing analogue to the
investigation of aptitude-treatment interactions (ATIs). The ATI position
states that, given a common set of instructional objectives, some students
will be more successful with one type of instruction, while other students
will be more successful with an alternate program (Bracht, 1970). The AAI view
suggests that, given the same set of instructional objectives, some students
will demonstrate the behaviors more successfully on one type of test, while
other students will be more successful in performance on an alternative
type of test.

Two heuristic testing models appear useful for the generation of AAIs
within the context of the SLC.⁴ One model might attempt to compensate in
the testing situation for learner deficits or deficiencies presumed to be
related to test performance by providing those conditions that the EMR child
cannot supply for himself. The actual deficit or deficiency is left untouched,
and only the debilitating effects are circumvented through the design of the
test and/or the testing situation. As an alternative to the compensatory
model, testing procedures can be developed to capitalize on the retardate's
relative ability strengths. This type of model is isomorphic in the sense
that the testing procedure is matched to one of the retardate's higher
aptitudes or to an ability where there is no presumed deficit. Here again,
no attempt is made to modify deficits through testing.

At their present level of development, the SLC tests are compensatory
with respect to the motivational difficulties of retardates and isomorphic
in relation to their verbal deficits. That is, the testing situation is
designed to heighten motivation, while the test itself is pictorial in nature
and assumes that the retardate's visual abilities are better than his verbal
abilities. Since this assumption is not likely to hold for all retardates, it
is anticipated that alternate testing procedures will be developed. It is
hoped that this kind of work will produce results that can serve as the basis
for the development of a variety of instructional strategies that parallel the
assessment procedures. At a more general level, the investigation of AAIs
within this framework should also have implications concerning the distinction
between competence and performance that has been advanced in other contexts
(Cole & Bruner, 1971; Bortner & Birch, 1970).

Applications Based on Classroom Differences

One obvious use of the tests is to employ them as criterion variables
in a complex multi-treatment or simple experimental-contrast group design in
an effort to assess the impact of the SLC on student learning. Here, the
reliability of student differences is not of concern because the appropriate
unit of analysis is not the student but the classroom (Glass, 1967; Glass &
Stanley, 1970; Wardrop, 1969; Raths, 19 S; Page, 1965; Wiley, 1965). The
reason for this rests with the type of instruction provided by the SLC or,
for that matter, any program which is not completely individualized.⁵ Since
the SLC involves programming for all students in a class simultaneously, the
responses of the students within a class are not independent. Furthermore, the
lack of independence would occur whether individual students or intact classes
are randomly assigned to the different treatments (Peckham et al., 1969;
Glass, 1967). However, since intact classes can be expected to respond
independently of each other, a valid analysis can be performed on the
classroom means employing the classroom as the unit of analysis. For this
application, then, the reliability of the tests with respect to
differentiating classrooms must be examined. See Wiley (1970) for an extended
discussion and critique of this position.
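
The unit-of-analysis argument can be illustrated with a minimal Python sketch. It assumes a simple two-group contrast and a pooled-variance t statistic computed on classroom means; the function names and the toy scores are hypothetical and are not taken from the study. The point of the sketch is only that the degrees of freedom are counted in classrooms, not in students.

```python
from math import sqrt
from statistics import mean, variance


def class_means(classes):
    """Reduce each classroom (a list of student scores) to its mean."""
    return [mean(scores) for scores in classes]


def t_on_class_means(treated, control):
    """Pooled-variance t statistic with the classroom as the unit of analysis.

    `treated` and `control` are lists of classrooms; each classroom is a
    list of student scores. Degrees of freedom count classrooms only.
    """
    a, b = class_means(treated), class_means(control)
    df = len(a) + len(b) - 2
    pooled = ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / len(a) + 1 / len(b)))
    return t, df


# Hypothetical toy data: three classrooms per condition, two students each.
treated = [[6, 8], [7, 9], [8, 10]]
control = [[4, 6], [5, 7], [6, 8]]
t, df = t_on_class_means(treated, control)
print(round(t, 3), df)
```

With six classrooms the contrast has 4 degrees of freedom, whereas treating the twelve students as independent observations would claim 10.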

Knowledge of reliable differences between classrooms on SLC tests would
also suggest potential variability in teacher behaviors as a function of
these differences as well as encourage attention to the effects of possible
differences in other moments or characteristics of classroom distributions,
particularly the variance. For example, it might be suspected that classes
with low achievement variances may be taught by teachers who spend a
disproportionate amount of time with the more retarded students, while classes
with higher achievement variances may be taught by teachers who focus on
the able students (Peckham et al., 1969). It is also possible that
classrooms with low achievement and behavior variances may be characterized by
a teacher-directed atmosphere, while classrooms with higher variances may have
a student-directed atmosphere (Costin, 1971). Lohnes (1972) extends this
point and defines a classroom's syntality as the behavioral characteristics
of the individuals comprising it, and suggests that the syntality of the
classroom distribution should include reference not only to the mean and
variance on a measure, but its skewness and kurtosis as well. Thus, an
examination of the relationship between the syntality of special classrooms
employing the SLC and teacher behaviors might well provide a possible
explanation of the results of other curriculum studies which reported wide
variations in the classroom behaviors of teachers using the same instructional
materials (Gallagher, 1966; Rosenshine, 1970, 1972). A more traditional,
alternate working hypothesis might suggest that variations in the behaviors
of teachers using SLC materials are not due to differences in the syntality of
the classrooms but simply to differences in the teachers' characteristics,
attitudes, and beliefs in relation to both their failure to initiate and
maintain a high level of program implementation and the lack of an
acknowledged body of pedagogy to which all teachers subscribe (Cohen, 1972).
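
The four distributional characteristics mentioned above (mean, variance, skewness, and kurtosis) can be computed per classroom as in the following Python sketch. It assumes population (divide-by-n) moments and the non-excess definition of kurtosis; these conventions, the function name, and the sample scores are assumptions made for illustration rather than details given in the paper.

```python
def distribution_moments(scores):
    """Mean, variance, skewness, and kurtosis of one classroom's scores.

    Uses population central moments; kurtosis is m4 / m2**2 (a normal
    distribution gives 3). Shape statistics are None when variance is zero.
    """
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((s - mean) ** 2 for s in scores) / n
    m3 = sum((s - mean) ** 3 for s in scores) / n
    m4 = sum((s - mean) ** 4 for s in scores) / n
    if m2 == 0:
        return {"mean": mean, "variance": 0.0, "skewness": None, "kurtosis": None}
    return {
        "mean": mean,
        "variance": m2,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


# Hypothetical scores for a single classroom.
print(distribution_moments([1, 2, 3, 4, 5]))
```

Applying the function to every classroom yields one profile per class, which could then be related to observed teacher behaviors.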

Applications Based on Test Differences 

For several test uses previously discussed, additional reliability
information is needed concerning the degree to which differences can be
detected among tests presumed to measure different SLC concepts. For example,
knowledge of reliable differences among the tests would provide evidence of
multidimensional structure and suggest that grouping procedures could be
applied using each test separately with students being reconstituted based
on the patterning of their scores. Failure to detect reliable differences
among the tests would support a unidimensional view and lead to grouping
procedures based on a total score on all of the SLC tests. This type of
reliability also has implications for the design of curriculum evaluation
studies; unidimensional results would indicate the use of a univariate design,
while multidimensional findings would dictate the use of a multivariate design
(Smith, 1972; Bock, 1966; Baker, 1969).

SUMMARY

This study investigated three types of reliability that need to be
demonstrated in order to provide an empirical basis for the applications that
have been discussed. Furthermore, by relating types of reliability to
particular test uses, a clear guide is provided for the selection of
appropriate reliability coefficients that is lacking in current treatments
(APA, 1966; McGaw et al., 1972).

METHOD 

The SLC and SLC-Based Tests 

The pedagogical model of the SLC is based on the expansion of the growing
individual's world through predominantly social environments or levels, namely,
the Self, the Home and Family, the Neighborhood, and the Community. At the
Self Level, facts about the child logically constitute the substance of
learning. The teaching elements of the SLC at the Self Level are divided into
11 Phases, each dealing with an array of related concepts and associated
behaviors. See Goldstein (1969) and Heiss & Mischio (1971) for an extended
discussion of the rationale underlying the construction of the SLC.

The Social Learning Curriculum Survey Test (SLCST), an experimental set
of test items, was developed in an effort to tap samples of the conceptual
skills contained in each of the 11 Phases (Lehrer, Heiss & Mischio, 1971).
The testing procedures reflect the need to assess retarded children in relation
to specific objectives of the Self Level while minimizing their verbal deficits
and motivational difficulties which have been reported to adversely affect
performance (House & Zeaman, 1963; Garjuoy et al., 1967; Spreen, 1965;
Luria, 1961; Green & Zigler, 1963; Stevenson & Zigler, 1957; Zigler, 1962).
The items require the student to listen to a question and respond by marking an
"X" on one of four picture stimuli. The 11 Phase tests are prepared in separate
booklets. In addition, there is one booklet containing practice items intended
to provide the students with training in the format of the test. During
testing, all instructions are read aloud by the test administrator.
Instructions for each item are detailed and redundant in order to compensate
for the poor verbal skills of the respondents. No reading skills are necessary
to understand the instructions and, to avoid any possible confusion, no written
instructions are included in the test booklets. For each item, the
administrator holds up the test booklet so that the students can verify the
page they should be working on, while a proctor circulates around the room
encouraging the students to maintain their test-taking behavior.

For this study, the following five Phase tests of the SLCST were randomly
selected:

1. Recognizing Dependence. These items are intended to measure the child's
ability to identify various authority figures in the school setting and his
understanding of their roles.

2. Recognizing and Reacting to Emotions. This test measures the child's
ability to identify and differentiate various emotional states.

3. Communicating with Others. These items examine the child's ability to
identify different modes of communication and to understand the symbols within
various communication modalities. The test also assesses the ability of the
child to relate appropriate communication modes to the feelings and moods of
others, as well as his ability to choose the appropriate communication modality
for different situational contexts.

4. Attaining Social Skills. These items are designed to measure the child's
understanding of the appropriate behavioral responses to different social and
environmental situations.

5. Identifying Helpers. These items examine the ability of the child to know
when to ask for help, whom to ask for help, how to ask for help, and what to
do when problems arise in the classroom situation.

Two procedures were employed to reduce the original pool of SLCST items
for each of the five selected Phases. First, items were eliminated that
(a) did not, as originally intended, relate to the behavioral objectives of a
particular Phase, (b) contained vague, ambiguous picture stimuli, or
(c) possessed poor test characteristics (difficulty and discriminability)
based on previous item analyses. Second, ten items per Phase were randomly
selected from the remaining pool.
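
The two-step reduction of the item pool can be sketched in Python as follows. The screening flags, the item identifiers, and the fixed random seed are hypothetical stand-ins for the judgments and item analyses described above; only the logic of screening first and then drawing ten items at random per Phase comes from the text.

```python
import random


def select_items(pool, per_phase=10, seed=0):
    """Screen one Phase's item pool, then draw `per_phase` items at random.

    Each item is a dict with an "id" and three hypothetical boolean flags
    corresponding to screening criteria (a), (b), and (c).
    """
    eligible = [
        item["id"]
        for item in pool
        if item["matches_objective"]      # (a) relates to the Phase objectives
        and item["clear_pictures"]        # (b) picture stimuli are unambiguous
        and item["good_statistics"]       # (c) acceptable difficulty/discrimination
    ]
    if len(eligible) < per_phase:
        raise ValueError("not enough eligible items remain after screening")
    return sorted(random.Random(seed).sample(eligible, per_phase))


# Hypothetical pool of 20 items, five of which fail a screening criterion.
pool = [
    {"id": n, "matches_objective": n % 5 != 0,
     "clear_pictures": True, "good_statistics": n != 7}
    for n in range(1, 21)
]
print(select_items(pool))
```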

Samples 

The subjects consisted of ten randomly selected students from each of 13
randomly selected primary level EMR classrooms drawn from each of two
geographical samples that have participated at various times during the last
four years in the field testing of the SLC. See Fratkin (1972) for a complete
description of field testing activities. The samples were selected to
represent polarities in racial, ethnic, and social class composition. Sample A
is located in predominantly white, working class communities in northeastern
Pennsylvania where assignment to special class placement is the responsibility
of one central agency. Sample B is located in southwestern Florida and
represents a racially and economically heterogeneous population, including
black and white English speaking families in addition to migrant, bilingual
families. Assignment to special class is the responsibility of several
placement agencies. The means and standard deviations for CA and IQ for both
samples are presented in Table 1.

Insert Table 1 about here

Analysis of Data

A general data layout for the design is presented in Table 2.

Insert Table 2 about here



Estimates of the sources of variability needed to obtain the three
reliabilities were generated based on a components of variance approach
(Medley & Mitzel, 1963; Cronbach et al., 1963; Cronbach et al., in press;
McGaw et al., 1973; Lindquist, 1953). The procedure differs from traditional
views of reliability in that it permits the simultaneous examination of many
sources of variability employing an analysis of variance (ANOVA) model. An
estimate of each component in the design was obtained from a completely
random, partially nested three-way ANOVA with one subject per cell. The model
for the analysis was

(1)  $X_{ijm} = \mu + C_i + S_{j(i)} + T_m + CT_{im} + ST_{j(i)m} + E_{ijm}$

where $\mu$ is a general mean and $E_{ijm}$ is specific error. The parentheses
around the subscripts for the student (S) dimension indicate that it was
nested within classes (C).
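
As an illustration of the partially nested design, the following Python sketch computes sums of squares, degrees of freedom, and mean squares for classes, students within classes, tests, the class-by-test interaction, and the residual. It is a plain textbook computation for balanced data and is not the program used in the study; the function names and the nested-list data layout `x[i][j][m]` are assumptions made for the example.

```python
def _mean(values):
    values = list(values)
    return sum(values) / len(values)


def anova_table(x):
    """Balanced C x S(C) x T analysis; x[i][j][m] is the score of student j
    (nested in class i) on test m. Returns source -> {"ss", "df", "ms"}."""
    n_c, n_s, n_t = len(x), len(x[0]), len(x[0][0])
    classes, students, tests = range(n_c), range(n_s), range(n_t)

    grand = _mean(x[i][j][m] for i in classes for j in students for m in tests)
    c_mean = [_mean(x[i][j][m] for j in students for m in tests) for i in classes]
    t_mean = [_mean(x[i][j][m] for i in classes for j in students) for m in tests]
    s_mean = [[_mean(x[i][j]) for j in students] for i in classes]
    ct_mean = [[_mean(x[i][j][m] for j in students) for m in tests] for i in classes]

    ss = {
        "C": n_s * n_t * sum((c_mean[i] - grand) ** 2 for i in classes),
        "S": n_t * sum((s_mean[i][j] - c_mean[i]) ** 2
                       for i in classes for j in students),
        "T": n_c * n_s * sum((t_mean[m] - grand) ** 2 for m in tests),
        "CT": n_s * sum((ct_mean[i][m] - c_mean[i] - t_mean[m] + grand) ** 2
                        for i in classes for m in tests),
        # Student-by-test interaction confounded with error (one score per cell).
        "RES": sum((x[i][j][m] - s_mean[i][j] - ct_mean[i][m] + c_mean[i]) ** 2
                   for i in classes for j in students for m in tests),
    }
    df = {
        "C": n_c - 1,
        "S": n_c * (n_s - 1),
        "T": n_t - 1,
        "CT": (n_c - 1) * (n_t - 1),
        "RES": n_c * (n_s - 1) * (n_t - 1),
    }
    return {k: {"ss": ss[k], "df": df[k], "ms": ss[k] / df[k]} for k in ss}
```

For 13 classes, 10 students per class, and 5 tests, the degrees of freedom produced by this layout are 12, 117, 4, 48, and 468.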

In the analysis, variability due to classes, students, and tests was
considered systematic, while the other sources were allocated to error. This
is consistent with the three types of reliability discussed in the first
section of the paper. Thus, the model for the analysis can be re-written as

(2)  $X_{ijm} = \mu + C_i + S_{j(i)} + T_m + e_{ijm}$.

In terms of partitioning the variance provided by the analysis, then,

(3)  $\sigma^2_X = \sigma^2_C + \sigma^2_S + \sigma^2_T + \sigma^2_e$, where

(4)  $\sigma^2_e = \sigma^2_{CT} + \sigma^2_{ST} + \sigma^2_E$.

Since there was only one observation per cell, $\sigma^2_{ST} + \sigma^2_E$
was estimated as $\hat{\sigma}^2_{RES}$. The expected mean squares were derived
through procedures suggested by Cornfield & Tukey (1956) and are presented in
Table 3, while the estimates of each component are contained in Table 4.
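
The relation between the expected mean squares and the component estimates can be checked with a short Python sketch. It assumes the design sizes of this study (13 classes, 10 students per class, 5 tests) as defaults; the function names and the component values used in the round trip are hypothetical and chosen only to show that the estimators recover the components from which the expected mean squares were built.

```python
def expected_mean_squares(comp, n_classes=13, n_students=10, n_tests=5):
    """Expected mean squares for the C x S(C) x T random model.

    comp["RES"] stands for sigma^2_ST + sigma^2_E, which cannot be separated
    with one observation per cell.
    """
    base = comp["RES"]
    ct = base + n_students * comp["CT"]
    return {
        "C": ct + n_tests * comp["S"] + n_students * n_tests * comp["C"],
        "S": base + n_tests * comp["S"],
        "T": ct + n_classes * n_students * comp["T"],
        "CT": ct,
        "RES": base,
    }


def variance_components(ms, n_classes=13, n_students=10, n_tests=5):
    """Solve the expected-mean-square equations for each component."""
    return {
        "C": (ms["C"] - ms["S"] - ms["CT"] + ms["RES"]) / (n_students * n_tests),
        "S": (ms["S"] - ms["RES"]) / n_tests,
        "T": (ms["T"] - ms["CT"]) / (n_classes * n_students),
        "CT": (ms["CT"] - ms["RES"]) / n_students,
        "RES": ms["RES"],
    }


# Hypothetical components, used only to verify the algebra.
comp = {"C": 0.5, "S": 1.0, "T": 2.0, "CT": 0.25, "RES": 1.5}
print(variance_components(expected_mean_squares(comp)))
```

In practice the observed mean squares are substituted for their expectations, so an estimate can come out negative when a component is near zero.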



Insert Tables 3 and 4 about here 



The analysis was performed by a computer program written by Finn (1971).
Based on the results, three reliability coefficients were derived. The first
provided a measure of the reliability with which special classes could be
differentiated. This coefficient was estimated as

(5)  $\rho_C = \hat{\sigma}^2_C / (\hat{\sigma}^2_C + \hat{\sigma}^2_e)$,

where $\hat{\sigma}^2_C$ was estimated from Table 4 and $\hat{\sigma}^2_e$ by
substituting in equation (4) the appropriate estimates from Table 4.⁶ The
second coefficient provided an index of the reliability with which students
could be distinguished in performance. This coefficient was estimated as

(6)  $\rho_S = \hat{\sigma}^2_S / (\hat{\sigma}^2_S + \hat{\sigma}^2_e)$.

Finally, the third coefficient indicated the degree to which tests measuring
different social concepts could themselves be differentiated and was estimated
as

(7)  $\rho_T = \hat{\sigma}^2_T / (\hat{\sigma}^2_T + \hat{\sigma}^2_e)$.
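
The three coefficients can be computed from the variance components as in the Python sketch below. The function name and dictionary keys are hypothetical; the input values are the Sample A component estimates reported in Table 5, and the error term is formed as the sum of the class-by-test and residual components, following equation (4).

```python
def reliabilities(comp):
    """Reliability of class, student, and test differences.

    comp holds variance component estimates keyed "C", "S", "T", "CT", "RES";
    error variance is CT + RES (RES already pools student-by-test and error).
    """
    error = comp["CT"] + comp["RES"]
    return {k: comp[k] / (comp[k] + error) for k in ("C", "S", "T")}


# Sample A component estimates as reported in Table 5.
sample_a = {"C": 0.03, "S": 1.2, "T": 1.1, "CT": 0.3, "RES": 1.3}
print({k: round(v, 2) for k, v in reliabilities(sample_a).items()})
# → {'C': 0.02, 'S': 0.43, 'T': 0.41}
```

The rounded results agree with the Sample A reliabilities listed in Table 5.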
Results

Estimates of each of the variance components as well as the three
reliabilities for each sample are presented in Table 5.

Insert Table 5 about here

The over-all findings suggest that at their present level of development, the
test measures are not adequate for the purposes discussed. That is, with respect
to classrooms, differences in the average levels of performance are simply not
apparent in either sample. This does not minimize the possibility of
demonstrating substantial differences in the variances of the classes, a
separate issue which will not be considered here. Although student differences
appear to be the largest source of systematic variation in both samples, the
reliability of these differences does not reach acceptable levels. Regarding
the tests at the Self Level, the evidence suggests that a unidimensional view
of the total score on the 5 Phases is much more tenable than a
multidimensional view. Finally, the data on the variance components from the
two separately sampled areas appear quite comparable despite known differences
in the racial, ethnic, and social class compositions of the samples.

Discussion 

The failure to identify adequate between-class, -student, and -test
differences may in part be due to the rationale underlying the development of
the measures at the Self Level of the SLC. The major purpose was to construct
items that tap objectives of the curriculum as opposed to items that simply
discriminate. While heightened discriminability can certainly be achieved
by deleting certain items and adding others, the procedure reduces the
relevance of the items to specific curriculum objectives. Thus, the
psychometric criterion of test discriminability appears to be incompatible
with the construction of items that are intended to assess content objectives
(Husek, 1969; Tyler, 1966). At the same time, there does not seem to be much
value in constructing curriculum-based measures for the purposes previously
discussed that do not adequately distinguish the two units of analysis
(students and classes). For example, the issue of the differential grouping of
EMR children for the purpose of SLC instruction seems to require the very kind
of discrimination that the data do not support and, perhaps, can never support
based on the above procedures for developing and selecting items. It would
appear, then, that different item selection procedures are needed depending on
the test application. Two approaches warrant consideration. First, for those
uses excluding the evaluation of the effectiveness of SLC instruction, the
most reasonable model is one in which the items are sensitive to individual
differences in the units of analysis. Wide latitude should be permitted in
the development and selection of the items to ensure adequate discriminability.
In addition, the notion of social learning should be considered as a generic
construct that is not limited to but extends beyond the specific objectives
of the SLC. This approach requires the use of traditional, normative-referenced
test criteria for which the statistical analysis conducted in the
present study is most appropriate.

Second, for the purpose of evaluating the outcomes of SLC instruction, the
items should be sensitive to differences in instructional emphasis (Gleser, 1963;
Hammock, 1960; Roudabush, 1973). For this use, item selection should conform as
closely as possible to the specific objectives of the curriculum. The criteria
employed here require a minimum two-stage tryout of items, that is, a
pre-instruction administration of the items, followed by SLC instruction, then
a post-instruction administration of the items to the same students. For this
use, it would be desirable to retain items which were answered correctly
by all students following instruction but were answered incorrectly prior to
instruction. Since each approach follows different item selection procedures
and different types of analysis, uniquely different tests are likely to be
constructed. Recent evidence suggests that less than half the items selected
by the normative criteria were selected based on the instructional criteria
(Roudabush, 1973). The moral of this study appears to rest with the fact
that item selection based primarily on the instructional model will not
necessarily meet normative test criteria.
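
The two-stage tryout described above can be sketched in Python as a simple pre-to-post difference index per item. The index (proportion correct after instruction minus proportion correct before), the threshold parameter, and the toy responses are assumptions made for illustration; an index of 1.0 corresponds to the ideal item described in the text, missed by every student before instruction and passed by every student afterward.

```python
def sensitivity_index(pre, post):
    """Proportion correct after instruction minus proportion correct before.

    `pre` and `post` are equal-length lists of 0/1 responses to one item by
    the same students.
    """
    if not pre or len(pre) != len(post):
        raise ValueError("pre and post must be non-empty and of equal length")
    return sum(post) / len(post) - sum(pre) / len(pre)


def select_sensitive_items(responses, threshold=1.0):
    """Keep items whose index meets the (hypothetical) threshold.

    `responses` maps an item identifier to a (pre, post) pair of lists.
    """
    return [
        item for item, (pre, post) in responses.items()
        if sensitivity_index(pre, post) >= threshold
    ]


# Hypothetical responses from three students to three items.
responses = {
    "a": ([0, 0, 0], [1, 1, 1]),  # missed by all before, passed by all after
    "b": ([1, 1, 0], [1, 1, 1]),  # mostly known before instruction
    "c": ([0, 0, 0], [1, 0, 1]),  # partial gain
}
print(select_sensitive_items(responses))                 # strict criterion
print(select_sensitive_items(responses, threshold=0.5))  # relaxed criterion
```

Item "b" would be attractive under a normative criterion if it discriminated among students, yet it is dropped here because most students already answered it correctly before instruction.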

Superimposed on these considerations is the issue of the unit of analysis. 
If the present data are any relative indication, it will be far easier to 
differentiate EMR children than EMR classes at the same site under both item 
selection procedures. Furthermore, the items that discriminate individuals 
may not be the same ones that discriminate groups or classes (Lewy, 1973). 
Footnotes

¹ The preparation of this paper was supported by a grant from the U.S.
Office of Education, Bureau for the Education of the Handicapped, Project
#6-1368.

² The authors wish to thank Herbert Goldstein, Barry Lehrer, and Gregory
Schimoler for their helpful comments, and Carol Sternberg for her assistance
in the data analysis.

³ The idea of curriculum-based or instructional grouping discussed here could
also be extended to include Thelen's (1967) notion of teachability grouping.
This approach would require teachers to identify those students who did well in
the class as well as those students who did not. The students would then be
tested on variables relevant to effective classroom behavior, including the
SLC-based tests. The responses that differentiate the two groups identified by
the teacher could be made into a scoring key to be used with next year's classes.
The teacher's "teachable" class, then, would be composed of high scorers using
the compatibility discriminating key.

⁴ See Salomon (1972) for an extended discussion of the ATI analogues.

⁵ For a discussion of instructional issues related to the differences between
the student and the classroom as the unit of analysis see Thelen (1969) and
Lindvall & Cox (1969).

⁶ The estimate of classroom variation ($\hat{\sigma}^2_C$) also contained
variability, if any, due to differences among teachers and schools. Separate
estimates of the three components can only be obtained in a design that samples
at least two classrooms for each of at least two teachers at each of at least
two schools. The design was impossible to implement in the present study since
each teacher spends the entire day with the same self-contained special class,
that typically being the only EMR class in the school.

Table 1
Sample Characteristics

Measure                Sample A        Sample B

Chronological Age      Mean = 10.2     Mean = 9.5
                       SD = 2.1        SD = 1.9

IQᵃ                    Mean = 70.7     Mean = 66.4
                       SD = 8.5        SD = 7.2

ᵃ Estimates are based on the WISC or Stanford-Binet.

Table 2
Data Layout for Design

Factor C - Classes                i = 1, 2, ... 13
Factor S (within C) - Students    j = 1, 2, ... 10
Factor T - Tests                  m = 1, 2, ... 5

Classes:     C_1                           C_i                           C_13
Students:    S_1(1) .. S_j(1) .. S_10(1)   S_1(i) .. S_j(i) .. S_10(i)   S_1(13) .. S_j(13) .. S_10(13)
Tests:       T_m (one row per test, crossed with every student in every class)

Table 3
Expected Mean Squares

Source                         E(MS)

Systematic
  Classes (C)                  $\sigma^2_E + \sigma^2_{ST} + 10\sigma^2_{CT} + 5\sigma^2_S + 50\sigma^2_C$
  Students Within Class (S)    $\sigma^2_E + \sigma^2_{ST} + 5\sigma^2_S$
  Tests (T)                    $\sigma^2_E + \sigma^2_{ST} + 10\sigma^2_{CT} + 130\sigma^2_T$

Error ($\sigma^2_e$)
  C x T                        $\sigma^2_E + \sigma^2_{ST} + 10\sigma^2_{CT}$
  Residual                     $\sigma^2_E + \sigma^2_{ST}$

Table 4
Variance Components

Systematic Components of Variance

  $\hat{\sigma}^2_C = \frac{1}{50}(MS_C - MS_S - MS_{CT} + MS_{RES})$
  $\hat{\sigma}^2_S = \frac{1}{5}(MS_S - MS_{RES})$
  $\hat{\sigma}^2_T = \frac{1}{130}(MS_T - MS_{CT})$

Error ($\sigma^2_e$)

  $\hat{\sigma}^2_{CT} = \frac{1}{10}(MS_{CT} - MS_{RES})$
  $\hat{\sigma}^2_{RES} = MS_{RES}$

Table 5
ANOVA for Components

                            Sample A                  Sample B
Source           df      MS      σ̂²     ρ          MS      σ̂²     ρ

Classes (C)      12     11. S    .03    .02        17.3     .18    .10
Students (S)    117      7.3    1.2     .43         1.1    1.3     .44
Tests (T)         4    141.9    1.1     .41        66.1     .5     .20
C X T            48      4.3     .3                 2.1     .06
Residual        468      1.3    1.3                 1.5    1.5